Comparing Classification Methods for Campaign Management: A Comparison of Logistic Regression, k-Nearest Neighbour, and Decision Tree Induction
Authors
Abstract
Extensive research has been performed to develop appropriate machine learning techniques for different data mining problems, and has led to a proliferation of different learning algorithms. However, previous work has shown that no learner is generally better than another learner. For repeated data mining tasks, practitioners have started conducting empirical comparison studies to determine the best technique. Comparing different machine learning methods, however, is a non-trivial task and depends very much on the characteristics of the particular data set and the requirements of the respective business domain. The data mining literature provides little guidance in this respect. Selecting customers who are likely to respond to direct marketing campaigns is an important task in marketing departments, and one where machine learning techniques have been used repeatedly. A systematic comparison of classifier performance can achieve considerable gains in marketing effectiveness. This case study assesses the predictive performance of different classification methods for SMS campaign management in the telecommunications industry. The analysis is based on a large data set with 165 variables and 10,000 instances. Evaluating data mining methods for marketing campaigns has special requirements compared to other application domains. Whereas overall performance or the error rate is typically the important selection criterion, for campaign management it is more important to select the technique that performs best on the first few quantiles of the overall customer base. In addition, one typically has to deal with very large data sets. This study selects candidate techniques and relevant evaluation criteria for campaign management and provides a guideline for similar comparison studies.
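The emphasis on the first few quantiles of the customer base can be illustrated with a lift calculation. The following is a minimal sketch, not the evaluation procedure used in the study: the function name `top_quantile_lift`, its parameters, and the toy data are hypothetical, and it assumes each customer has a model score and a binary response flag.

```python
from typing import Sequence


def top_quantile_lift(
    scores: Sequence[float],
    responded: Sequence[int],
    fraction: float = 0.1,
) -> float:
    """Response rate in the top-scored `fraction` of customers,
    divided by the response rate of the whole customer base.

    Hypothetical helper for illustration only.
    """
    if len(scores) != len(responded) or len(scores) == 0:
        raise ValueError("scores and responded must be non-empty and equally long")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    overall_rate = sum(responded) / len(responded)
    if overall_rate == 0:
        raise ValueError("lift is undefined when nobody responded")

    # Rank customers from highest to lowest model score.
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)

    # Keep at least one customer in the selected segment.
    n_top = max(1, int(len(ranked) * fraction))
    top_rate = sum(flag for _, flag in ranked[:n_top]) / n_top

    return top_rate / overall_rate


# Toy data (assumed, not from the study): 10 customers, 2 responders.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responded = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
lift_top_20 = top_quantile_lift(scores, responded, fraction=0.2)
```

A lift above 1 means that contacting only the top-scored segment reaches responders at a higher rate than contacting customers at random. Two classifiers with similar overall error rates can differ substantially on this measure, which is why a campaign-oriented comparison may rank them differently.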
Similar resources
A Comparison of Logistic Regression, k-Nearest Neighbor, and Decision Tree Induction for Campaign Management
Extensive research has been performed to develop appropriate machine learning techniques for different data mining problems. However, previous work has shown that no learner is generally better than another learner. Comparing machine learning methods depends very much on the characteristics of a particular data set and the requirements of the respective business domain. Direct marketing is an i...
Comparing different stopping criteria for fuzzy decision tree induction through IDFID3
Fuzzy Decision Tree (FDT) classifiers combine decision trees with approximate reasoning offered by fuzzy representation to deal with language and measurement uncertainties. When a FDT induction algorithm utilizes stopping criteria for early stopping of the tree's growth, threshold values of stopping criteria will control the number of nodes. Finding a proper threshold value for a stopping crite...
A novel hybrid method for vocal fold pathology diagnosis based on russian language
In this paper, first, an initial feature vector for vocal fold pathology diagnosis is proposed. Then, for optimizing the initial feature vector, a genetic algorithm is proposed. Some experiments are carried out for evaluating and comparing the classification accuracies which are obtained by the use of the different classifiers (ensemble of decision tree, discriminant analysis and K-nearest neig...
Development of soft computing models for data mining
The increasing amount and complexity of today's data available in science, business, industry and many other areas creates an urgent need to accelerate discovery of knowledge in large databases. Such data can provide a rich resource for knowledge discovery and decision support. To understand, analyze and eventually use this data, a multidisciplinary approach called data mining has been propos...
Spam Classification Using Nearest Neighbour Techniques
Spam mail classification and filtering is a commonly investigated problem, yet there has been little research into the application of nearest neighbour classifiers in this field. This paper examines the possibility of using a nearest neighbour algorithm for simple, word based spam mail classification. This approach is compared to a neural network, and decision-tree along with results published ...